Integration Methods and Accelerated Optimization Algorithms
Abstract
We show that accelerated optimization methods can be seen as particular instances of multi-step integration schemes from numerical analysis, applied to the gradient flow equation. In contrast with recent work in this vein, the differential equation considered here is the basic gradient flow, and we show that multi-step schemes allow this differential equation to be integrated with larger step sizes, which intuitively explains acceleration results.

Introduction

The gradient descent algorithm used to minimize a function f has a well-known and simple numerical interpretation as the integration of the gradient flow equation,

\[ x(0) = x_0, \qquad \dot{x}(t) = -\nabla f(x(t)) \tag{Gradient Flow} \]

using Euler's method. This appears to be a rather unusual connection between optimization and numerical methods, since the two fields have inherently different goals. On one hand, numerical methods aim to get a precise discrete approximation of the solution x(t) on a finite time interval. Methods more sophisticated than Euler's were developed to achieve better consistency with the continuous-time solution, but they still focus on a finite time horizon (see for example Süli and Mayers, 2003). On the other hand, optimization algorithms seek the minimizer of a function, which corresponds to the infinite time horizon of the gradient flow equation. Structural assumptions on f led to algorithms more sophisticated than the gradient method, such as the mirror gradient method (see for example Ben-Tal and Nemirovski, 2001; Beck and Teboulle, 2003), the proximal gradient method (Nesterov et al., 2007), or a combination thereof (Duchi et al., 2010; Nesterov, 2015). Among them, Nesterov's accelerated gradient algorithm (Nesterov, 1983) is proven to be optimal on the class of smooth convex or strongly convex functions.
This last method was designed with the lower complexity bounds in mind, but its proof relies on purely algebraic arguments and the key mechanism behind acceleration remains elusive, which has led to various interpretations of it (Bubeck et al., 2015; Allen-Zhu and Orecchia, 2017; Lessard et al., 2016). A recent stream of papers used differential equations to model the acceleration behavior and offer a better interpretation of Nesterov's algorithm (Su et al., 2014; Krichene et al., 2015; Wibisono et al., 2016; Wilson et al., 2016). However, the differential equation is often quite
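
The reading of gradient descent as Euler's method applied to the gradient flow can be illustrated with a minimal sketch. It is not taken from the paper: the quadratic test function, the curvature values, the step sizes, and all function names are assumptions chosen for illustration. For this particular f, the gradient flow has a closed-form solution, so the Euler iterates can be compared against it directly.

```python
import math

# Hypothetical test problem (an assumption, not from the paper):
# f(x) = 0.5 * sum(a_i * x_i**2), minimized at x = 0, with gradient a_i * x_i.
CURVATURES = [1.0, 10.0]


def grad_f(x):
    """Gradient of the illustrative quadratic."""
    return [a * xi for a, xi in zip(CURVATURES, x)]


def euler_gradient_flow(x0, h, n_steps):
    """Explicit Euler on x'(t) = -grad f(x(t)).

    Each Euler step x <- x - h * grad f(x) is exactly one gradient
    descent step with step size h.
    """
    x = list(x0)
    for _ in range(n_steps):
        g = grad_f(x)
        x = [xi - h * gi for xi, gi in zip(x, g)]
    return x


def exact_gradient_flow(x0, t):
    """Closed-form gradient flow for this quadratic: x_i(t) = x_i(0) * exp(-a_i * t)."""
    return [xi * math.exp(-a * t) for a, xi in zip(CURVATURES, x0)]


x0 = [1.0, 1.0]

# Small step: Euler tracks the continuous solution up to t = h * n_steps = 1.
h, n_steps = 0.01, 100
x_euler = euler_gradient_flow(x0, h, n_steps)
x_exact = exact_gradient_flow(x0, h * n_steps)
error = max(abs(u - v) for u, v in zip(x_euler, x_exact))

# Large step: for this quadratic, Euler is unstable once h exceeds
# 2 / max(CURVATURES) = 0.2, so the iterates blow up.
x_unstable = euler_gradient_flow(x0, 0.25, 50)

print("Euler:", x_euler)
print("Exact:", x_exact)
print("Max error:", error)
print("Unstable run:", x_unstable)
```

The second run shows the step-size restriction that Euler's method faces on this example; the abstract's claim is that multi-step schemes can integrate the same equation with larger steps, which this sketch does not attempt to reproduce.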